Do not DNAT packets from WSL2's loopback0 #48075

Merged
merged 1 commit into from
Sep 17, 2024

Contributor

@robmry robmry commented Jun 27, 2024

- What I did

When running WSL2 with mirrored mode networking, add an iptables rule to skip DNAT for packets arriving on interface loopback0 that are addressed to a localhost address - they're from the Windows host.

WSL2's mirrored mode networking is outlined here.

- How I did it

Detect WSL2 mirrored mode by the presence of interface loopback0, and (inspired by this workaround linked from the WSL ticket) /usr/bin/wslinfo --networking-mode reporting mirrored, see wslinfo release note.

If needed, create a rule in the nat-DOCKER chain to return early for packets arriving on loopback0 for 127.0.0.0/8.

There's no IPv6 rule, because WSL2 mirrored mode doesn't support it.

- How to verify it

As described on the ticket, with docker-ce installed in an instance of Linux (Ubuntu) running under WSL2 with networkingMode=mirrored - run an nginx container with -p 8080:80, check that the Windows host can connect to it via http://localhost:8080.

Also checked that the new iptables rule is not created unless it's needed.

Access from Linux to a service running on the Windows localhost address worked before and after this change.

(--userland-proxy=true, the default, is required for this to work.)

New unit test, just to check the conditions for adding the rule.

- Description for the changelog

Support WSL2 mirrored-mode networking's use of interface `loopback0` for packets from the Windows host.

@robmry robmry added the status/1-design-review, kind/feature, area/networking, area/lcow, area/networking/firewalling and area/networking/d/bridge labels Jun 27, 2024
@robmry robmry self-assigned this Jun 27, 2024
func mirroredWSL2() bool {
if _, err := netlink.LinkByName("loopback0"); err != nil {
if !errors.As(err, &netlink.LinkNotFoundError{}) {
log.G(context.TODO()).Warnf("Failed to check for WSL interface: %v", err)
Member

Minor nit; can you use WithError() or perhaps even WithFields() in case we want to log the name of the interface we're looking for?

Suggested change
- log.G(context.TODO()).Warnf("Failed to check for WSL interface: %v", err)
+ log.G(context.TODO()).WithError(err).Warn("Failed to check for WSL interface")
Suggested change
- log.G(context.TODO()).Warnf("Failed to check for WSL interface: %v", err)
+ log.G(context.TODO()).WithFields(log.Fields{"error": err, "interface": "loopback0"}).Warn("Failed to check for WSL interface")

Contributor Author

Oops, yes - I've split out the err, thank you.

I didn't feel the need to report the loopback0 interface name ... it might be in the netlink message anyway, but the idea was just to log a failed netlink call (having ignored LinkNotFoundError).

return false
}
output, err := exec.Command(wslinfoPath, "--networking-mode", "-n").CombinedOutput()
log.G(context.TODO()).Debugf("wslinfo --networking-mode:%s err:%v", string(output), err)
Member

Same here, at least for the error, but perhaps output could also be a field.

Contributor Author

I've split out err.

ipv: iptables.IPv4,
table: iptables.Nat,
chain: DockerChain,
args: []string{"-i", "loopback0", "-d", "127.0.0.0/8", "-j", "RETURN"},
Member

As the magic loopback0 interface will be used more now (even more with my suggestions), I'm wondering if it would make sense to define a const for it as well; at least it would allow documenting "what's this magic loopback0?"

(downside is that it's less grep'able for loopback0 in the code, although the const can still be a good starting point I guess)

So, yeah, not a "strong" opinion one way or another 😅

Contributor Author

I tried it with a const name ... but I don't think it made things any clearer, so put it back.

The gigantic comment just above explains what it is.

Comment on lines 547 to 549
func mirroredWSL2Workaround(ipv iptables.IPVersion) error {
// WSL2 does not (currently) support Windows<->Linux communication via ::1.
Member

I notice that we don't check if wslinfoPath (/usr/bin/wslinfo) exists, which (I think) is the way to detect if we're actually running on WSL2?

  • Should we have a detectWSL2() utility to verify if we're running on WSL2, as this workaround is not for other situations
  • Perhaps that check should be a sync.Once or sync.OnceValue if we only need to check it once (but may not be needed if we only run this code once 😂)

For testing, maybe we need something to override the detection 🤔 (not sure?)

Contributor

I notice that we don't check if wslinfoPath (/usr/bin/wslinfo) exists

There's no need to: the exec.Command call would fail with ENOENT if the path does not exist.

Contributor Author

Indeed - but it's gone now anyway.

Contributor

@corhere corhere left a comment

Running /usr/bin/wslinfo as root is not ideal, especially on non-WSL environments. Some ideas:

  • Run wslinfo as an unprivileged user. But which user?
  • Sandbox the wslinfo process in a container
  • More sanity checks to limit the exposure to non-WSL users:
    • Test that /run/WSL exists and is a directory
    • Test that both /usr/bin and /usr/bin/wslinfo are owned by root and not world-writable?

// mirroredWSL2 returns true if the host Linux appears to be running under
// Windows WSL2 with mirrored mode networking. If a loopback0 device exists, it
// checks that /usr/bin/wslinfo is executable and reports mirrored networking.
func mirroredWSL2() bool {
Contributor

Should we use sync.Once and cache the answer? Can the networking mode change dynamically?

Contributor Author

This only happens once, when the DOCKER chain etc are set up.

As far as I can tell, the mode can't change dynamically ... I guess it's a bit of a major change under Linux's feet. New config seems to take effect on a guest-Linux restart.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: WSL cannot change network modes dynamically - it's set when we start the VM. (as it has impacts beyond just the Linux configuration, but also the host vswitch and tcpip stack configuration).
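
For reference, the caching idea raised in this thread could look like the sketch below. It assumes Go 1.21 or later for `sync.OnceValue`, and `detect` is a stand-in for the real check, not the PR's `mirroredWSL2`. As noted above, the PR only runs the check once, so this was not needed.

```go
package main

import (
	"fmt"
	"sync"
)

// detectCalls counts how often the stand-in detection actually runs.
var detectCalls int

// detect is a placeholder for a check that should run only once, such as
// probing for a WSL2 networking mode that cannot change while the VM is up.
func detect() bool {
	detectCalls++
	return true
}

// mirroredOnce runs detect on first use and returns the cached result on
// every later call.
var mirroredOnce = sync.OnceValue(detect)

func main() {
	first := mirroredOnce()
	second := mirroredOnce()
	fmt.Println("results:", first, second)
	fmt.Println("detect ran", detectCalls, "time(s)")
}
```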

@robmry
Contributor Author

robmry commented Jun 27, 2024

Running /usr/bin/wslinfo as root is not ideal, especially on non-WSL environments. Some ideas:

  • Run wslinfo as an unprivileged user. But which user?

  • Sandbox the wslinfo process in a container

  • More sanity checks to limit the exposure to non-WSL users:

    • Test that /run/WSL exists and is a directory
    • Test that both /usr/bin and /usr/bin/wslinfo are owned by root and not world-writable?

Yes - I think we could just use the presence of loopback0 and wslinfo to infer that it's wsl2 and mirrored networking.

If we're wrong, which seems unlikely (unless it's deliberate) ... we create the rule to skip DNAT for packets from that interface addressed to 127.0.0.0/8 - there won't be any but, even if there were, they'd only be DNAT'd if they were for a mapped port ... in which case they don't want DNAT either.

(But I've asked the Microsoft folks if there's an API equivalent anyway.)

@corhere
Contributor

corhere commented Jun 27, 2024

If we're wrong, which seems unlikely (unless it's deliberate)

That's what I'm worried about: some attacker tricking dockerd into executing attacker-controlled code as root.

@robmry
Contributor Author

robmry commented Jun 27, 2024

If we're wrong, which seems unlikely (unless it's deliberate)

That's what I'm worried about: some attacker tricking dockerd into executing attacker-controlled code as root.

Yes, agreed (that's why I raised the issue on our call). I'm suggesting we don't run it ... just see if it and loopback0 are there. There aren't really any consequences if we infer wrongly.

@thaJeztah
Member

thaJeztah commented Jun 28, 2024

Let me post the strace that Tianon posted on slack; does that output mean that there is some API it's using to detect? (I noticed some AF_UNIX and /run/WSL/1_interop); i.e., is that something that could be used for this purpose as alternative to exec'ing a binary?

socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/run/WSL/1_interop"}, 110) = 0
$ strace -ff wslinfo --networking-mode
execve("/usr/bin/wslinfo", ["wslinfo", "--networking-mode"], 0x7fff40ffd070 /* 36 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x40cc60)       = 0
set_tid_address(0x40cc00)               = 6710
gettid()                                = 6710
gettid()                                = 6710
gettid()                                = 6710
gettid()                                = 6710
brk(NULL)                               = 0x1b8e000
brk(0x1b90000)                          = 0x1b90000
mmap(0x1b8e000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x1b8e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a11000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a10000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a0f000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a0e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a0d000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a0c000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a0b000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a0a000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a09000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a07000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f62e9a05000
sched_getaffinity(0, 128, [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]) = 32
getpid()                                = 6710
getpid()                                = 6710
uname({sysname="Linux", nodename="catra", ...}) = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/run/WSL/1_interop"}, 110) = 0
write(3, "\32\0\0\0\10\0\0\0", 8)       = 8
poll([{fd=3, events=POLLIN}], 1, 10000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
read(3, "\1\0\0\0", 4)                  = 4
close(3)                                = 0
ioctl(1, TIOCGWINSZ, {ws_row=24, ws_col=80, ws_xpixel=0, ws_ypixel=0}) = 0
writev(1, [{iov_base="nat", iov_len=3}, {iov_base="\n", iov_len=1}], 2nat
) = 4
munmap(0x7f62e9a05000, 8192)            = 0
munmap(0x7f62e9a07000, 8192)            = 0
munmap(0x7f62e9a09000, 4096)            = 0
munmap(0x7f62e9a0b000, 4096)            = 0
munmap(0x7f62e9a0c000, 4096)            = 0
munmap(0x7f62e9a0d000, 4096)            = 0
munmap(0x7f62e9a0e000, 4096)            = 0
munmap(0x7f62e9a0f000, 4096)            = 0
munmap(0x7f62e9a10000, 4096)            = 0
munmap(0x7f62e9a11000, 4096)            = 0
exit_group(0)                           = ?
+++ exited with 0 +++

Only reference I found to that on GitHub was this; https://github.com/nullpo-head/wsl-distrod/blob/c3181f4e49566a1d92988b71487716f0dceddd1e/distrod/distrod/tests/test_runner.sh#L133-L134
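
To make the bytes exchanged over `/run/WSL/1_interop` easier to read, here is a small decoding sketch. It only reinterprets the bytes shown in the trace as little-endian 32-bit words (an assumption based on the trace coming from an x86-64 process); what the words mean, for example a message type and a length, is a guess and not documented anywhere in this thread.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeWords splits b into little-endian 32-bit words. Trailing bytes that
// do not fill a whole word are ignored.
func decodeWords(b []byte) []uint32 {
	words := make([]uint32, 0, len(b)/4)
	for len(b) >= 4 {
		words = append(words, binary.LittleEndian.Uint32(b))
		b = b[4:]
	}
	return words
}

func main() {
	// Bytes copied from the write and read lines of the trace. strace prints
	// them as octal escapes: \32 is 26 and \10 is 8.
	request := []byte{0o32, 0, 0, 0, 0o10, 0, 0, 0}
	reply := []byte{1, 0, 0, 0}

	fmt.Println("request words:", decodeWords(request)) // [26 8]
	fmt.Println("reply words:", decodeWords(reply))     // [1]
}
```

In the traced run the reply word was 1 and the tool printed `nat`; the values used for other networking modes are not shown here.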

@robmry robmry force-pushed the wsl2_mirrored_loopback0_workaround branch from 37e8f55 to 1cd6cb3 Compare June 28, 2024 18:57
_, _, _, _, err := setupIPChains(configuration{EnableIPTables: true}, iptables.IPv4)
assert.NilError(t, err)
assert.Check(t, mirroredWSL2Rule().Exists() == tc.expLoopback0Rule,
fmt.Sprintf("Expected exists=%v", tc.expLoopback0Rule))
Contributor

Everyone forgets that assert.Check takes a format string and arguments.

Suggested change
- fmt.Sprintf("Expected exists=%v", tc.expLoopback0Rule))
+ "Expected exists=%v", tc.expLoopback0Rule)

Member

Ah! I didn't spot it this time; good catch.

Using is.Equal(mirroredWSL2Rule().Exists(), tc.expLoopback0Rule) will also do all of that; it prints the variable names in failure, so if those are descriptive, it will be printing something like;

    foo_test.go:35: assertion failed: false (bool) != true (tc.expLoopback0Rule bool)

Contributor Author

Swapped for is.Equal.

Comment on lines 569 to 572
if stat, err := os.Stat(wslinfoPath); err == nil {
return stat.Mode().IsRegular() && (stat.Mode().Perm()&0111) != 0
}
return false
Contributor

Nit: the hit to readability doesn't seem worth saving one line vertically here. https://go.dev/wiki/CodeReviewComments#indent-error-flow

Suggested change
- if stat, err := os.Stat(wslinfoPath); err == nil {
- return stat.Mode().IsRegular() && (stat.Mode().Perm()&0111) != 0
- }
- return false
+ stat, err := os.Stat(wslinfoPath)
+ if err != nil {
+ return false
+ }
+ return stat.Mode().IsRegular() && (stat.Mode().Perm()&0111) != 0

Contributor Author

Done.

Comment on lines 382 to 386
// Avoid overwriting a real "/usr/bin/wslinfo" (and clashing with a real
// loopback0).
if _, err := os.Stat(wslinfoPath); err == nil {
t.Skip("Skipping test because " + wslinfoPath + " exists")
}
Contributor

Sounds like a job for namespaces! We don't have to worry about clashing with a real loopback0 because the test cases are invoked in an unshared network namespace thanks to netnsutils.SetupTestOSContext(). While we don't have a ready-made test utility for unsharing a mount namespace and setting things up such that parts of the host filesystem can be mutated such that the mutations are only visible to the test, we do have all the building blocks. The general idea is to over-mount a path with an overlayfs with the same path as the lowerdir, and a tmpfs path as the upper:

root@56e9443a7364:~# mkdir /tmp/upper /tmp/work
root@56e9443a7364:~# unshare --mount-proc --propagation slave bash
root@56e9443a7364:~# mount -t overlay -olowerdir=/usr/bin,upperdir=/tmp/upper,workdir=/tmp/work overlay /usr/bin
root@56e9443a7364:~# rm /usr/bin/ls
root@56e9443a7364:~# ls -ld
bash: ls: command not found
root@56e9443a7364:~# exit
exit
root@56e9443a7364:~# ls -ld
drwx------ 1 root root 4096 Jun 28 23:05 .

...it may be too much to tack onto this PR, though. Save for a follow-up?

Contributor Author

I forgot to delete that check when I changed wslinfoPath from const to var so that the test could modify it to avoid the clash that way instead. It's gone now.

@robmry robmry force-pushed the wsl2_mirrored_loopback0_workaround branch from 1cd6cb3 to 708b1e6 Compare July 1, 2024 08:52
@dannyhpy

dannyhpy commented Jul 2, 2024

This PR fixes the issue listed in microsoft/WSL#10494 for the loopback interface only, sadly. Running docker run -ti --rm -p 192.168.1.100:8080:80 traefik/whoami and opening http://192.168.1.100:8080/ in the browser on the host machine still fails with an ERR_CONNECTION_TIMED_OUT. On other hosts in the LAN, though, this works.

The hostAddressLoopback option is set to true in my %USERPROFILE%\.wslconfig file as noted in https://learn.microsoft.com/en-us/windows/wsl/wsl-config and running another program such as python -m http.server works and is reachable from the Windows host at http://192.168.1.100:8000/

@robmry
Contributor Author

robmry commented Jul 3, 2024

This PR fixes the issue listed in microsoft/WSL#10494 for the loopback interface only, sadly. Running docker run -ti --rm -p 192.168.1.100:8080:80 traefik/whoami and opening http://192.168.1.100:8080/ in the browser on the host machine still fails with an ERR_CONNECTION_TIMED_OUT. On other hosts in the LAN, though, this works.

Hi @dannyhpy - this seems to be unrelated to this loopback0 workaround, please could you raise it as a separate issue?

I took a look, and could repro ...

Packets arrive on the WSL2 guest's ethN interface, with the shared host address as the source. So, everything looks normal. They're DNAT'd, arrive at the container, and get a response addressed to the shared host address (192.168.1.100 in the example). But, the response packets don't get back to the WSL2 ethN interface.

That's because the dest MAC address in response packets belongs to the docker network's bridge (the container's default gateway). The WSL2 guest has the destination IP address itself, on its own ethN, and it doesn't know anything about the Windows version of that address. So, packets are delivered locally rather than sent back to ethN, then dropped because nothing in the Linux guest is listening on that dest port.

I don't have an idea for a workaround right now. Maybe somehow forcing use of docker-proxy for these connections would help, but it sounds messy. It'd be good to get some input from Microsoft on how they think this sort of thing could work.

@dannyhpy

dannyhpy commented Jul 4, 2024

Hi @robmry,

Thank you for your very detailed explanations. I originally thought this issue was similar to the loopback0 issue and wrongly assumed it could be solved using the same workaround. My apologies for my off-topic comment.

please could you raise it as a separate issue?

I'm afraid I won't be as capable as you are to properly describe the issue.

@robmry
Contributor Author

robmry commented Jul 5, 2024

Oh, no problem - thank you for raising the issue ... I've created #48136.

@louyongjiu

Hope this pull request can be merged promptly, as I have been troubled by this issue for quite some time.

@shigenobuokamoto

Will mirroredWSL2Rule() still be called if userland-proxy=false?
If userland-proxy=false, there is no process to receive the traffic, so I would like you to leave DNAT as is.

@robmry
Contributor Author

robmry commented Aug 13, 2024

Will mirroredWSL2Rule() still be called if userland-proxy=false? If userland-proxy=false, there is no process to receive the traffic, so I would like you to leave DNAT as is.

Hi @shigenobuokamoto - I think we could do that ... how will it help? Could you describe the use-case?

@shigenobuokamoto

@robmry thank you so much.

WSL 2.3.11 adds some improvements to communicating with Windows hosts.
This makes it possible to access a Docker network from the Windows host using nftables.
However, if DNAT is disabled, this will no longer work.

microsoft/WSL#10494 (comment)

https://gist.github.com/shigenobuokamoto/540c5f09a03eb07149501e99a6c8d82b

@robmry
Contributor Author

robmry commented Aug 13, 2024

Ah, excellent - thank you!

@robmry
Contributor Author

robmry commented Aug 29, 2024

@robmry thank you so much.

WSL 2.3.11 adds some improvements to communicating with Windows hosts. This makes it possible to access a Docker network from the Windows host using nftables. However, if DNAT is disabled, this will no longer work.

microsoft/WSL#10494 (comment)

https://gist.github.com/shigenobuokamoto/540c5f09a03eb07149501e99a6c8d82b

I've added the exception for running without userland-proxy. In testing it, I found that with the latest WSL2 pre-release (2.3.17 but, presumably, anything >2.3.11) it seemed to work fine with userland-proxy disabled, with no workaround in docker and no extra WSLPOSTROUTING nftables rule.
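
Putting the conditions from this thread together, the decision to add the rule can be sketched as a single predicate with a small table of cases. `wantLoopback0Rule` is a hypothetical name; the real code reaches the same answer through its own helpers.

```go
package main

import "fmt"

// wantLoopback0Rule reports whether the skip-DNAT rule should be created:
// loopback0 and wslinfo must both be present (taken as a sign of WSL2
// mirrored mode), and the userland proxy must be enabled so that something
// is listening to receive the traffic that skips DNAT.
func wantLoopback0Rule(loopback0, wslinfoExists, userlandProxy bool) bool {
	return loopback0 && wslinfoExists && userlandProxy
}

func main() {
	cases := []struct {
		desc          string
		loopback0     bool
		wslinfoExists bool
		userlandProxy bool
	}{
		{"not WSL2", false, false, true},
		{"loopback0 only", true, false, true},
		{"wslinfo only", false, true, true},
		{"mirrored, proxy on", true, true, true},
		{"mirrored, proxy off", true, true, false},
	}
	for _, tc := range cases {
		got := wantLoopback0Rule(tc.loopback0, tc.wslinfoExists, tc.userlandProxy)
		fmt.Printf("%-20s rule=%v\n", tc.desc, got)
	}
}
```

Only the "mirrored, proxy on" case yields a rule; with the proxy disabled, DNAT is left as is.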

@shigenobuokamoto

@robmry, thank you for dealing with this matter.

desc string
loopback0 bool
userlandProxy bool
wslinfoExists bool
Member

Suggested change
wslinfoExists bool

When running WSL2 with mirrored mode networking, add an iptables
rule to skip DNAT for packets arriving on interface loopback0 that
are addressed to a localhost address - they're from the Windows
host.

Signed-off-by: Rob Murray <[email protected]>
@robmry robmry force-pushed the wsl2_mirrored_loopback0_workaround branch from 9600e1a to f9c0103 Compare September 16, 2024 08:31
@keith-horton keith-horton left a comment

I'm not 100% fluent in Go - but this looks good to me! (from the wsl side)

@robmry robmry merged commit d89eaad into moby:master Sep 17, 2024
133 checks passed
@robmry robmry deleted the wsl2_mirrored_loopback0_workaround branch September 17, 2024 08:09
@robmry robmry removed the area/lcow label Sep 17, 2024
@thaJeztah thaJeztah mentioned this pull request Sep 17, 2024
@thaJeztah thaJeztah added this to the 28.0.0 milestone Sep 18, 2024
Labels
area/networking/d/bridge, area/networking/firewalling, area/networking, kind/feature, process/cherry-picked
10 participants